
Full text of "USPTO Patents Application 10071393"






(19) Europäisches Patentamt
     European Patent Office
     Office européen des brevets

(11) EP 0 919 906 A2

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication:
     02.06.1999 Bulletin 1999/22

(21) Application number: 98122162.5

(22) Date of filing: 25.11.1998

(51) Int. Cl.6: G06F 3/00

(84) Designated Contracting States:
     AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE
     Designated Extension States:
     AL LT LV MK RO SI

(30) Priority: 27.11.1997 JP 325739/97

(71) Applicant:
     MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
     Kadoma-shi, Osaka 571-8501 (JP)

(72) Inventors:
     • Imagawa, Taro
       Hirakata-shi, Osaka 573-0071 (JP)
     • Kamei, Michiyo
       Hirakata-shi, Osaka 573-0165 (JP)
     • Mekata, Tsuyoshi
       Katano-shi, Osaka 576-0052 (JP)

(74) Representative:
     Grünecker, Kinkeldey, Stockmair & Schwanhäusser
     Anwaltssozietät
     Maximilianstrasse 58
     80538 München (DE)



(54) Control method 

(57) A control method monitors a person's attributes and,
based on the results, controls the equipment to be controlled
according to predetermined control content. The method further
monitors the person's peripheral environment and also uses
those results to execute the control.



[Front-page drawing; legible labels: "Environment",
"Control object and content determination section 5".]





Description 

BACKGROUND OF THE INVENTION 
1. Field of the Invention

[0001] The present invention relates to a technique for
operating equipment, manipulating information, or con- 
trolling environments based on people's motions, pos- 
tures, and conditions. 

2. Related Art of the Invention 

[0002] Certain conventional techniques for detecting
people's motions to operate equipment recognize people's
gestures to operate televisions (Japanese Patent
Applications Laid Open No. 8-315154 and No. 8-211979).
Japanese Patent Application Laid Open No. 8-315154 uses a
camera to detect the position of the palm of a person's hand
and his or her gesture in order to operate a television.

[0003] Japanese Patent Application Laid Open No. 8-211979
uses the position and shape of a person's hand
detected by a camera to input characters to a portable 
personal computer. 

[0004] These conventional approaches, however,
fundamentally require a person and an apparatus operated
by the person to correspond on a one-to-one basis,
and desirable operations are difficult to perform if there 
are multiple televisions or personal computers near the 
person or if there are multiple operators. 
[0005] In general, there are often multiple apparatuses
and people in a house or an office or out of doors,
so if these apparatuses are controlled using people's motions,
the people must be individually associated with the
apparatuses. If, for example, multiple televisions are
simultaneously operated, the conventional approaches
do not allow the television that is to be operated, or that
is being operated, to be distinguished. In addition, if there
are several people in the room, the conventional approaches
cannot determine who is changing the television channels
or who can change the channels.

SUMMARY OF THE INVENTION 

[0006] In view of these problems of the conventional 
apparatus, it is an object of this invention to provide a 
control method that can determine, despite the pres- 
ence of multiple apparatuses and people in the neigh- 
borhood, the correspondence between the apparatuses 
and people to smoothly operate the apparatuses using 
the people's motions, postures, and conditions. 
[0007] This invention provides a control method
characterized in that the attributes of one or several people
are continuously or intermittently monitored to control
predetermined equipment based on the detection of the
people's predetermined attribute. This invention also
provides a control method characterized in that candidates
for a control object and the content of control are
determined based on the people's predetermined
attribute and in that a control object and the content of
control are determined based on the candidates for a
control object and the content of control. Furthermore,
this invention provides a control method characterized
in that based on the detection of the attributes of the
several people, candidates for a control object and the
content of control are determined for each of the people
and in that a control object and the content of control are
determined based on the candidates for a control object
and the content of control.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0008]

FIG. 1 is a block diagram showing a first embodiment of this invention.
FIG. 2 is a block diagram showing a second embodiment of this invention.
FIG. 3 is a block diagram showing a third embodiment of this invention.
FIG. 4 is a block diagram showing a fourth embodiment of this invention.
FIG. 5 is a block diagram showing a fifth embodiment of this invention.
FIG. 6 is a block diagram showing a sixth embodiment of this invention.
FIG. 7 is a block diagram showing a seventh embodiment of this invention.
FIG. 8 is a block diagram showing an eighth embodiment of this invention.
FIG. 9 shows illustrations of various motions of people's hands or fingers.
FIG. 10 shows illustrations of various motions of people's hands or fingers.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0009] Embodiments of this invention are described
below with reference to the drawings. FIG. 1 is a block
diagram showing a first embodiment of this invention. In
this figure, 1 is a monitoring section, 2 is an operator
selection section, 3 is a control object candidate
determination section, 4 is a control content candidate
determination section, and 5 is a control object and content
determination section.

[0010] In FIG. 1 the monitoring section 1 continuously
monitors people's attributes and their peripheral
environment. The people's attributes include people's
positions, postures, faces, expressions, eye or head
direction, motions, voices, physiological conditions,
identities, forms, weights, sexes, ages, physical and
mental handicaps, and belongings. The physical and
mental handicaps include visual, physical, vocal, and
auditory handicaps and the inability to understand
language. The belongings include clothes, caps, glasses,
bags, and shoes.

[0011] The monitoring means include a camera (that
is sensitive to visible light or infrared rays), a
microphone, a pressure sensor, a supersonic sensor, a
vibration sensor, a chemical sensor, and a photosensor.
Other sensors may be used. The camera can be used
to monitor people's positions, postures, faces,
expressions, motions, forms, and belongings in a non-contact
manner.

[0012] If a person's position is monitored by a camera,
a position at which that person is present is assumed to
be an area of an image in which a flesh color is present.
The position at which a person is present may be an area
of an image including a different color or illuminance or
an area of an image in which infrared rays of wavelength
3 to 5 µm or 8 to 12 µm emitted mostly by people are
detected if infrared images are also used. If the person's
posture, form, or motion is monitored by the camera, a
method similar to that for detecting the person's position
is used to extract the person's rough shape in order to
monitor his or her posture or form based on the shape
while monitoring the temporal change in posture to
monitor his or her motion.
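
The flesh-color test in paragraph [0012] can be illustrated with a minimal sketch. The source does not give a color rule, so the threshold rule in `looks_like_skin`, the function names, and the plain nested-list image format are all assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def looks_like_skin(pixel: Pixel) -> bool:
    """Crude skin-tone rule (an assumption; the source gives no rule)."""
    r, g, b = pixel
    return (r > 95 and g > 40 and b > 20
            and r > g and r > b and (r - min(g, b)) > 15)


def person_bounding_box(
    image: List[List[Pixel]],
) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) enclosing all skin-like pixels.

    Returns None when no skin-like pixel is found, i.e. no person is
    assumed to be present in the image.
    """
    hits = [(y, x)
            for y, row in enumerate(image)
            for x, pixel in enumerate(row)
            if looks_like_skin(pixel)]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return min(ys), min(xs), max(ys), max(xs)
```

A real system would more likely segment connected regions and work in a color space that is less sensitive to lighting; the bounding box here only shows the idea of treating the flesh-colored area as the person's position.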

[0013] To determine, for example, whether the person
is standing or sitting or in which direction he or she is
reaching out, representative shapes for the respective
postures can be registered beforehand and compared
with the person's image. If the person's face or
expression or the direction of his or her head or eyes is
monitored by the camera, the head located at the top of his
or her body can be detected based on the above shape,
and his or her expression can be compared with
registered images of his or her face showing various
expressions in order to monitor his or her face and
expressions. The direction of the head or eyes can be
determined by detecting the positions of the eyes from
the image of the head detected using the above
procedure. If the positions of the eyes are symmetrical about
the head, the head can be determined to face frontward
relative to the camera, whereas if the eyes are biased to
the right or left, the head can be determined to face
rightward or leftward. The positions of the eyes can be
detected by detecting an elliptical or flat area within the
face that is darker than its surroundings. Moreover, the
directions of the eyes can be detected by detecting the circular
area of the iris at the center of the overall eye and
determining the offset between the center of the circular area
and the center of the overall area of the eye. If the
person's belonging is detected by the camera, it can be
determined for the person's area detected using the
above procedure that there is a belonging in a portion of
the body in which a color is detected that differs from the
person's color detected when he or she wears no
clothes. Determinations can also be made by registering
beforehand belongings having particular colors. The
glasses can be identified by determining whether there
is a shape of a frame around the eyes when the
positions of the face and eyes are detected using the above
procedure. In addition, a microphone can be used to
monitor in a non-contact manner the person's voice or
sound generated by his or her motion (sound produced
when the person claps his or her hands or snaps his or
her fingers, or footsteps). In addition, the physiological
condition such as the sound of a heartbeat can be
measured from a close or contact position. A pressure sensor
can be used to monitor contacts associated with the
person's motion or can be installed on the floor surface
to monitor his or her weight or walking pattern. In
addition, a supersonic sensor can be used to monitor the
distance to the person or his or her motion based on a
change in distance to him or her. The use of supersonic
waves enables the person's position to be monitored
even if lighting noticeably varies or there is no lighting.
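
The head-direction and gaze rules of paragraph [0013] can be sketched as follows. The coordinate convention (x increasing to the right of the image), the `tolerance` value, and both function names are assumptions introduced for illustration; the source only states that symmetric eye positions mean a frontal head and that the iris offset indicates the direction of the eyes.

```python
from typing import Tuple


def head_direction(eye_a_x: float, eye_b_x: float,
                   head_left_x: float, head_right_x: float,
                   tolerance: float = 0.1) -> str:
    """Classify head direction from where the eyes sit in the head region.

    "right" and "left" are in image coordinates (an assumption).
    """
    head_width = head_right_x - head_left_x
    if head_width <= 0:
        raise ValueError("head region must have positive width")
    head_centre = (head_left_x + head_right_x) / 2
    eyes_centre = (eye_a_x + eye_b_x) / 2
    # Normalised bias: 0 means the eyes are symmetrical about the head.
    bias = (eyes_centre - head_centre) / head_width
    if abs(bias) <= tolerance:
        return "front"
    return "right" if bias > 0 else "left"


def gaze_offset(iris_centre: Tuple[float, float],
                eye_centre: Tuple[float, float]) -> Tuple[float, float]:
    """Offset of the iris centre from the centre of the overall eye area."""
    return (iris_centre[0] - eye_centre[0], iris_centre[1] - eye_centre[1])
```

For instance, eyes at x = 60 and x = 80 inside a head region spanning x = 0 to 100 give a bias of 0.2, which this sketch classifies as "right".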
[0014] In addition, a vibration sensor can be used to
monitor vibration generated by the person's motion. The
physiological conditions can be monitored using a
chemical sensor for measuring the quantities of chemical
substances such as the concentration of ions or the
quantity of sugar or hormone in secretion, excreta, or
body fluids, or a photosensor for measuring a spectrum
distribution of light transmitted through a body. In addition,
based on information such as the features of the face or
the motion of the body obtained by the camera, the
features of the voice obtained by the microphone, or the
weight obtained by the pressure sensor, the person's
identity, sex, age, or physical and mental handicap can
be assumed. For example, if the features of the face of
a particular person, and his or her weight, form, sex,
age, and physical and mental handicap are registered
beforehand and if one of the features (for example,
weight) is used to identify this person, the other features
such as sex are known.
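
The registration example at the end of paragraph [0014] amounts to a lookup in a table of registered people. The sketch below assumes a hypothetical `RegisteredPerson` record and a weight tolerance of 2 kg; neither appears in the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RegisteredPerson:
    """Attributes registered beforehand (field names are illustrative)."""
    name: str
    weight_kg: float
    sex: str
    age: int


def identify_by_weight(measured_kg: float,
                       registry: List[RegisteredPerson],
                       tolerance_kg: float = 2.0) -> Optional[RegisteredPerson]:
    """Return the single registered person whose weight matches.

    Returns None when nobody matches or when the match is ambiguous.
    """
    matches = [p for p in registry
               if abs(p.weight_kg - measured_kg) <= tolerance_kg]
    return matches[0] if len(matches) == 1 else None
```

Once the person is identified from the floor pressure sensor reading, the remaining registered attributes (sex, age, and so on) are available without being measured directly.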

[0015] In addition, multiple cameras of different views
can be used to three-dimensionally determine the
person's position, posture, and motion, to improve the
monitoring accuracy, and to extend the monitoring range.
Likewise, multiple sensors of different natures can be
combined together to improve the accuracy and reliability
of the detection of the person's position, posture, or
motion.

[0016] People and their environment may be
intermittently monitored, and when the people's attributes do
not significantly vary or the control object does not
require fast control, the intermittent monitoring can
reduce the processing load of the series of control operations
and thus the calculation resources and energy consumption
required.

[0017] In FIG. 1, the operator selection section 2
selects an operator based on the results of monitoring
by the monitoring section 1. The monitoring section is
assumed to monitor three people. The operator selection
section 2 selects one of the three people based
on a predetermined attribute of the people monitored by the
monitoring section 1. The selection based on the
predetermined attribute refers to the selection of a person
closest to a predetermined position (for example, the
center of a room), a person who has assumed a
predetermined posture or has made a predetermined motion
(for example, raising his or her hand), a person of the
top priority based on predetermined people's priorities
(for example, within a family, the order of father, mother,
and children) (the priorities can be given based on
weight, sex, age, physiological condition, or physical
and mental handicap), a person who has spoken a
particular word (for example, "yes" or the name of an
apparatus), a person having a particular belonging (for
example, a red ball in his or her hand), or a person
directing his or her eyes or head toward a particular
position (for example, an ornament). In addition, the
operator selection section 2 may determine an evaluation
value for each person based on an evaluation
method predetermined based on people's attributes in
order to select a person having an evaluation value that
is larger than a reference value and that is also the
largest. In this case, the attributes of various people can be
taken into consideration. The evaluation method may
use the weighted sum of the strength of a voice given
when the person says "yes" and the speed at which the
person raises his or her hand.
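
The evaluation-value selection at the end of paragraph [0017] can be sketched as a weighted sum followed by a threshold test. The weights, the reference value, the normalisation of both measurements to the range 0 to 1, and the dictionary keys are assumptions chosen for illustration.

```python
from typing import Dict, Optional


def select_operator(observations: Dict[str, Dict[str, float]],
                    w_voice: float = 0.5,
                    w_hand: float = 0.5,
                    reference: float = 0.6) -> Optional[str]:
    """Pick the person with the largest evaluation value above `reference`.

    Each observation holds "voice_strength" (loudness of the spoken "yes")
    and "hand_speed" (speed of the hand raise), both assumed to be in 0..1.
    """
    scores = {
        name: w_voice * obs["voice_strength"] + w_hand * obs["hand_speed"]
        for name, obs in observations.items()
    }
    if not scores:
        return None
    best = max(scores, key=lambda name: scores[name])
    # No operator is selected unless the best score exceeds the reference.
    return best if scores[best] > reference else None
```

Returning `None` corresponds to the case, noted in the text, in which no person is selected because the predetermined attribute was not observed clearly enough.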

[0018] Next, the operator selection section 2 presents 
information on the selected person. If, for example, the 
name of the selected person can be identified, it may be 
displayed on a display screen or output as a voice, or 
music, sound, or an uttered name associated with the 
person beforehand may be output, or a symbol or char- 
acters associated with the person beforehand may be 
displayed on the display screen, or a signal may be 
transmitted to a device carried by the person. The 
device carried by the person can provide the information 
to the person by vibrating or outputting light or sound. 
Alternatively, a light may be directed to the selected per- 
son, or a device such as a display may be rotated and 
directed to the selected person, or an image of the 
selected person photographed by a camera may be dis- 
played on the display screen. A voice saying "What do 
you want?", or sound or light may be output to only the 
selected person immediately after his or her utterance 
or motion in order to inform the people of the selected 
person. 

[0019] Although, in the above example, the operator 
selection section 2 selects one from the three people, 
several (for example, 2) people may be selected or no 
person may be selected if the predetermined attribute 
cannot be monitored. 

[0020] In FIG. 1, the control object candidate
determination section 3 determines candidates for a
control object based on the predetermined attribute of
the person selected by the operator selection section 2
and his or her peripheral environment. The control
object may include equipment (inside a house: an air
conditioner, a television, a video, a light, a washer, a
personal computer, a game machine, or a pet robot;
outside a house: an elevator or a car) or information or
the contents of display used in information equipment
(characters or graphics displayed on a display screen).
The predetermined attribute of the selected person may
include an indicating posture with his or her finger, gaze,
or head, the utterance of particular words, sign
language, or the holding of a particular article. If the
indicating posture with the person's finger, gaze, or head is
used, candidates for a control object will be equipment
near the indicated position or the contents of display on
the display screen. The peripheral environment may
include temperature, humidity, illuminance, sound
volume, the condition of air currents, the concentration of a
particular gas (carbon dioxide) in the air, or time. If the
indoor temperature or humidity is high, the control
object may be an air conditioner, a fan, or a
dehumidifier. If the indoor illuminance is low or a particular time
(before sunset) has come, the control object may be
lighting. If the air current does not vary over a long
period of time or the concentration of carbon dioxide in
the air exceeds the reference value, the control object
may be a fan or window. If the outdoor sound exceeds
the reference value, the control object may be a
television or a window.
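
The environment rules of paragraph [0020] can be written as a small rule table. The numeric thresholds below are assumptions, since the source speaks only of values being "high", "low", or above "the reference value"; the `Environment` record and function name are likewise illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Environment:
    """Monitored peripheral environment (field names are illustrative)."""
    temperature_c: float
    humidity_pct: float
    illuminance_lux: float
    co2_ppm: float
    outdoor_sound_db: float


def object_candidates_from_environment(env: Environment) -> List[str]:
    """Map environment readings to control object candidates."""
    candidates: List[str] = []

    def add(*names: str) -> None:
        for name in names:
            if name not in candidates:  # keep order, avoid duplicates
                candidates.append(name)

    if env.temperature_c > 28 or env.humidity_pct > 70:  # assumed thresholds
        add("air conditioner", "fan", "dehumidifier")
    if env.illuminance_lux < 100:                        # assumed threshold
        add("lighting")
    if env.co2_ppm > 1000:                               # assumed threshold
        add("fan", "window")
    if env.outdoor_sound_db > 60:                        # assumed threshold
        add("television", "window")
    return candidates
```

The time-of-day rule and the stagnant-air rule are omitted here for brevity; they would be further conditions of the same form.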

[0021] There may be one or several candidates. If, for
example, the person selected as the operator is pointing
toward an air conditioner and a television, both of them
are to be controlled. If the utterance of words or sign
language is used as a predetermined attribute, the
name of the apparatus may be indicated or
predetermined words may be uttered. For example, the uttered
word "television" allows a television (or televisions) to be
used as a candidate for a control object, and the uttered
word "hot" allows a fan and an air conditioner to be used
as candidates for control objects. If the holding of a
particular article is used as a predetermined attribute,
equipment and articles may be mutually associated
beforehand. For example, by associating a red ball with
the air conditioner and associating a blue ball with the
television, the holding of the blue ball allows the
television to be used as a candidate for a control object.
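
The associations described in paragraph [0021] are, in effect, lookup tables registered beforehand. The sketch below uses the words and articles named in the text; the table names and the function signature are assumptions for illustration.

```python
from typing import Dict, List, Optional

# Associations registered beforehand, using the examples from the text.
WORD_TO_OBJECTS: Dict[str, List[str]] = {
    "television": ["television"],
    "hot": ["fan", "air conditioner"],
}
ARTICLE_TO_OBJECT: Dict[str, str] = {
    "red ball": "air conditioner",
    "blue ball": "television",
}


def object_candidates_from_operator(uttered_word: Optional[str] = None,
                                    held_article: Optional[str] = None) -> List[str]:
    """Collect control object candidates from a word and/or a held article."""
    candidates: List[str] = []
    if uttered_word is not None:
        candidates.extend(WORD_TO_OBJECTS.get(uttered_word, []))
    if held_article is not None:
        obj = ARTICLE_TO_OBJECT.get(held_article)
        if obj is not None and obj not in candidates:
            candidates.append(obj)
    return candidates
```

An unknown word or article simply contributes no candidates, which leaves the later determination step free to report that no control object could be determined.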

[0022] Next, the control object candidate determination
section 3 externally presents information on the
determined candidates for control objects. To present
information, a name indicating a control object may be
output as a voice or sound or a name for a control object
may be displayed on the display screen. Alternatively, a
light included in an apparatus that is a candidate for a
control object may be turned on or a speaker included in
an apparatus that is a candidate for a control object may
output a sound or voice.

[0023] In FIG. 1, the control content candidate
determination section 4 determines candidates for the
content of control based on the predetermined attribute of
the person selected by the operator selection section 2
and his or her peripheral environment. The content of
control may include switching on and off equipment,
changing operation parameters (sound volume, wind
quantity or direction, channels, or light quantity) of the
equipment, opening and closing a door or window,
moving or modifying an object on the display of information
equipment, changing the color of an object, or editing a
document.

[0024] The predetermined attribute may include the
use of a voice or sign language to indicate the content
of control (for example, the utterance of the word "on"
for switching on, the utterance of the word "off" for
switching off, or the utterance of the word "hot" for
switching on) or predetermined motions may be
associated with the content of control beforehand (clapping
hands once for switching on and clapping hands twice
for switching off). There may be one or several
candidates. If the person selected as the operator utters the
word "up", the candidate may be an increase in the
volume of the television or the set temperature for the air
conditioner, or the upward movement of an object
displayed on the display of information equipment.
[0025] The peripheral environment may include tem- 
perature, humidity, illuminance, sound volume, the con- 
dition of air currents, the concentration of a particular 
gas (carbon dioxide) in the air, or time. If the indoor
temperature or humidity is high, candidates for the content of
control may be switching on, or increasing the operating
power of, an air conditioner, a fan, or a dehumidifier.

[0026] If the indoor illuminance is low or a particular 
time (before sunset) has come, the candidates for the 
content of control may be the switching-on of lighting or 
increasing light quantity. If the air current does not vary 
over a long period of time or the concentration of carbon 
dioxide in the air exceeds the reference value, candi- 
dates for the content of control may be the switching-on 
of the fan or the opening of the window. If the outdoor 
sound exceeds the reference value, candidates for the 
content of control may be an increase in the sound vol- 
ume of the television or the closing of the window. 
[0027] Next, the control content candidate determina- 
tion section 4 presents information on the determined 
candidates for the content of control. To present infor- 
mation, a name indicating the content of control may be 
output as a voice or sound or displayed on the display 
screen. 

[0028] In FIG. 1, the control object and content
determination section 5 determines a control object and the
content of control based on the candidates for a control
object determined by the control object candidate
determination section 3 and the candidates for the content of
control determined by the control content candidate
determination section 4, and then effects the
determined control on the determined control object. The
control object and the content of control are identified by
limiting the candidates for a control object and the
content of control. To limit the candidates, a predetermined
combination of the candidates for a control object and
the content of control is selected. For example, if the
candidates for a control object are the television and air
conditioner and the candidate for the content of control
is to "increase the temperature" and if this content is not
provided for the television but for the air conditioner,
then the control object will be the air conditioner and the
content of control will be to increase the set temperature
for the air conditioner. In addition, as a further limitation
of the candidates, when the air conditioner is selected as a
candidate for a control object based on the person's
indicating motion and increasing the temperature for the
air conditioner is selected as a candidate for the content
of control based on the person's uttered word
"increase", the control object and the content of control
are not selected if the time interval between the time T1
at which the indicating motion was monitored and the
time T2 at which the uttered word "increase" was
monitored is larger than or equal to the reference value (for
example, three seconds). Likewise, if the time T2
precedes the time T1, the control object and the content of
control are not selected. In this manner, the candidates
can be limited by taking the time interval or order
between the times T1 and T2 into account. This
technique can reduce the rate of misjudgment caused by a
combination of the accidental occurrence of this
indicating motion and the accidental utterance of this word
during conversation. In addition, if the duration of the
indicating motion is shorter than the reference value (for
example, the motion must last 1 second or more), the
control object is not selected either. This can reduce the
rate of misjudgment if this indicating motion is accidentally
made as a daily motion. This limitation need not be
provided when there is no need for it (if, for example,
there is only one candidate or multiple control objects
are simultaneously controlled).
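
The limiting step of paragraph [0028] combines three checks: the order of T1 and T2, the interval between them, and the duration of the indicating motion, followed by keeping only the predetermined object and content combinations. The sketch below follows the example values in the text (three seconds, one second); the `SUPPORTED` table, the function name, and the parameter names are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple

# Predetermined combinations: which content each object provides (illustrative).
SUPPORTED: Dict[str, Set[str]] = {
    "television": {"increase volume", "change channel"},
    "air conditioner": {"increase temperature", "increase wind force"},
}


def decide_control(object_candidates: List[str],
                   content_candidates: List[str],
                   t1_pointing: float,
                   t2_utterance: float,
                   pointing_duration: float,
                   max_interval: float = 3.0,
                   min_duration: float = 1.0) -> List[Tuple[str, str]]:
    """Return the (object, content) pairs that survive all limitations."""
    if t2_utterance < t1_pointing:                  # word preceded the pointing
        return []
    if t2_utterance - t1_pointing >= max_interval:  # interval too long
        return []
    if pointing_duration < min_duration:            # pointing too brief
        return []
    return [(obj, content)
            for obj in object_candidates
            for content in content_candidates
            if content in SUPPORTED.get(obj, set())]
```

With the television and air conditioner as object candidates and "increase temperature" as the only content candidate, only the air conditioner pairing survives, matching the example in the text.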
[0029] Next, the control object and content determination
section 5 presents the information on the determined
control object and content of control. The determined
control object and content of control can be presented
as in the presentation of the candidates for a control
object or the content of control.
[0030] In addition, the control object and content
determination section 5 may indicate confirmation,
indicate the need for reentry, or present information such
as candidate selections and the inability to determine
the control object. With the indication of confirmation,
execution is confirmed for the determined control object
and content of control by uttering the word "OK?",
displaying it on the display screen, or outputting a
predetermined sound. The indications of the need for reentry
and of the inability to determine the object urge reentry
if the control object and content of control cannot be
determined easily (for example, the people's
predetermined attribute is ambiguous) by uttering the words
"Enter data again", displaying them on the display
screen, or outputting a predetermined sound. With the
candidate selections, if there are multiple control
objects and contents of control, the selections are
displayed on the display screen to urge selection.
[0031] In addition, the monitoring section 1 monitors
the attributes of the operator after the control object and
content determination section 5 has presented
information, and the control object and content determination
section 5 limits the candidates for a control object and
the content of control, determines a control object and
the content of control, and presents information again.
The reduction of the control objects and contents of
control may be repeated based on the presentation of
information and the monitoring of the person's attribute
after presentation. The repetition can reduce the
number of candidates. To confirm execution after the
control object and content determination section 5 has
presented information, the uttered word "Yes" is
monitored as the operator's attribute to control the control
object. If the selections of candidates are displayed on
the display screen, they are numbered and the uttered
words indicating these numbers are monitored to
control the control object. The control object candidate
determination section 3 and the control content candidate
determination section 4 may urge selection from the
candidates. In this case, since the
person selects a particular object and content from the
candidates, the control object and content determination
section 5 determines a control object and the
content of control based on this action of selection.
[0032] The operation of the first embodiment of this 
invention configured in the above manner is described 
below. It is assumed that there are three people A, B, 
and C in a room and that there are an air conditioner, a 
television, and a fan in this room. The monitoring
section 1 continuously monitors the people's attributes and
their peripheral environment. If A and B point to the
television, the operator selection section 2 considers B,
who preceded A in pointing to the television, to be an 
operator and outputs B's name as a voice. Then, it is 
known that A cannot operate the television whereas B 
can do it, so this apparatus can be controlled without 
confusion despite the presence of the several people. 
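
The rule used in paragraph [0032], in which the person who pointed first becomes the operator, can be sketched as a minimum over observed pointing times. The function name and the timestamp dictionary are assumptions for illustration.

```python
from typing import Dict, Optional


def first_to_point(pointing_times: Dict[str, float]) -> Optional[str]:
    """Return the person whose pointing motion was observed earliest.

    `pointing_times` maps each person to the time (in seconds) at which
    his or her pointing motion was monitored; people who did not point
    are simply absent from the mapping.
    """
    if not pointing_times:
        return None
    return min(pointing_times, key=lambda name: pointing_times[name])
```

With A pointing at 2.4 s and B at 1.1 s, the sketch selects B, as in the example; C, who did not point, is never considered.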
[0033] In addition to the outputting of B's name as a 
voice, the operator selection section 2 can present infor- 
mation on the operator by displaying the name of the 
selected person on the display screen, outputting 
music, sound, or an uttered name as sounds associated 
with the person beforehand, displaying on the display 
screen a symbol or characters associated with the per- 
son beforehand, or transmitting a signal to a device car- 
ried by the person. The device carried by the person 
can provide the information to the person by vibrating or 
outputting light or sound. Alternatively, a light may be 
directed to the selected person, or a device such as a 
display may be rotated and directed to the selected per- 
son, or an image of the selected person photographed 
by a camera may be displayed on the display screen. A 
voice saying "What do you want?", or sound or light
may be output to only the selected person immediately 
after his or her utterance or motion in order to inform the 
people of the selected person. By informing the people 
of the selected operator in this manner, they know who 
can operate the apparatus, and confusion can be 
avoided even if several people attempt to operate it. 



[0034] The control object candidate determination
section 3 determines candidates for a control object
based on the predetermined attribute of the person
selected by the operator selection section 2 or his or her
peripheral environment. If B is pointing to the
neighborhood of the television and air conditioner, the television
and air conditioner are determined as candidates for a
control object. Even if the air conditioner is away from
the location to which B is pointing, it is included as a
candidate for a control object if the room temperature,
which is a peripheral environment, is high. Next, the
control object candidate determination section 3
presents information on the determined candidates for a
control object. The information may be presented by
outputting the names of the control objects as voices or
sounds or displaying them on the display screen.
Alternatively, a light included in the candidate apparatus for
a control object may be turned on, or a sound or voice
may be output from a speaker included in the candidate
apparatus for a control object. The presentation of the
information on the candidates for a control object
enables the operator to check whether a desired object is
included in the candidates. The selection of the plurality
of candidates enables the desired control object to be
reliably included in the candidates.

[0035] The control content candidate determination
section 4 determines candidates for the content of
control based on the predetermined attribute of the person
selected by the operator selection section 2 or his or her
peripheral environment. If B says "Strong" while
simultaneously pointing to a certain location, the candidates
for the content of control will be an increase in wind
force, cooling or warming performance, or illuminance.
[0036] Next, the control content candidate determination
section 4 presents information on the determined
candidates for the content of control. The information
may be presented by outputting names for the contents
of control as voices or sounds or displaying
them on the display screen. The presentation of the
information on the candidates for the content of control
enables the operator to check whether a desired
content is included in the candidates. The selection of the
plurality of candidates enables the desired content of
control to be reliably included in the candidates.

[0037] If there are a large number of control objects
and contents of control, the information on all of them 
need not be presented. 

[0038] The control object and content determination
section 5 determines a control object and the content of
control based on the candidates for a control object
determined by the control object candidate determination
section 3 and the candidates for the content of control
determined by the control content candidate determination
section 4, and then effects the determined control on the
determined control object. The method described below can
be used to determine a control object and the content of
control from their candidates. That is, only a predetermined
combination of a candidate for a control object
and a candidate for the content of control is adopted. In
the above case, if the candidates for a control object are
the television and air conditioner and there are four
candidates for the content of control, that is, an increase in
wind force, cooling and warming performance, and
illuminance, the television cannot be combined with any of the
candidates for the content of control, so the air conditioner
is selected as a control object and the content of
control that can be combined with the air conditioner is
limited to an increase in wind force and cooling and
warming performance. Moreover, by monitoring the
temperature or season as the people's environment, the
increase in warming performance can be excluded as the
content of control if the temperature exceeds 30 °C or if
it is summer. In addition, by recording the history of
the people's control beforehand, the combinations of
control objects and contents of control that have not
been used before can be excluded.
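
The combination rule of paragraph [0038] can be sketched in code. The following is only an illustrative sketch, not the claimed implementation: the table SUPPORTED, the function name determine_combinations, and the string labels are hypothetical names introduced here, and the 30 °C threshold is the example value given above.

```python
from itertools import product

# Hypothetical table: which contents of control each apparatus supports.
SUPPORTED = {
    "television": {"increase volume", "switch channel"},
    "air conditioner": {
        "increase wind force",
        "increase cooling",
        "increase warming",
    },
}


def determine_combinations(object_candidates, content_candidates,
                           temperature_c=None, season=None, history=None):
    """Return the (object, content) pairs that survive every filter."""
    # Adopt only predetermined (supported) combinations.
    pairs = [
        (obj, content)
        for obj, content in product(object_candidates, content_candidates)
        if content in SUPPORTED.get(obj, set())
    ]
    # Exclude an increase in warming when it is hot or summer.
    too_hot = temperature_c is not None and temperature_c > 30
    if too_hot or season == "summer":
        pairs = [p for p in pairs if p[1] != "increase warming"]
    # Exclude combinations never used before, if a history is recorded.
    if history is not None:
        pairs = [p for p in pairs if p in history]
    return pairs
```

With the television and air conditioner as object candidates and the four content candidates of the example, only the air conditioner pairs remain, and the warming pair drops out above 30 °C.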
[0039] In this manner, the control object and content 
determination section 5 determines a control object and 
the content of control based on their candidates and the 
peripheral environment, so the control object and con- 
tent of control can be more reliably identified even if it is 
difficult to identify them separately only from the prede- 
termined attributes of people. In particular, if an ambig- 
uous attribute of a person such as his or her motion or 
posture is used, determinations are difficult even with 
very accurate recognition. The present approach, how- 
ever, enables a desired control object and the desired 
content of control to be selected using the daily motion 
or posture of the person and without forcing him or her 
to make a clear motion.

[0040] Next, the control object and content determination
section 5 presents the information on the determined
control object and content of control. The determined 
control object and content of control can be presented 
as in the presentation of the candidates for a control 
object or the content of control. The presentation of the 
information on the control object and the content of con- 
trol enables the operator to check whether a desired 
control object and the desired content of control have 
been selected. 

[0041] In addition, the control object and content
determination section 5 may indicate confirmation, indicate
the need for reentry, or present information such
as candidate selections and the inability to determine
the control object. With the indication of confirmation,
execution is confirmed for the determined control object
and content of control by uttering the word "OK?",
displaying it on the display screen, or outputting a
predetermined sound. The confirmation can prevent erroneous
control. The indication of the need for reentry and of
the inability to determine the object urges reentry if the
control object and content of control cannot be determined
easily. In this case, reselection is urged by uttering
the words "Enter data again", displaying them on the
display screen, outputting a predetermined sound, or
displaying the selections on the display screen. In the
above case, the operator is prompted to aurally indicate
again whether to increase the wind force of the air
conditioner or its cooling performance. Then, the monitoring
section 1 monitors the attributes of the operator after the
control object and content determination section 5 has
presented information, and the control object and content
determination section 5 limits the candidates for a
control object and the content of control, determines a
control object and the content of control, and presents
information again. In the above case, when the person
says "Wind", the control object and content determination
section 5 determines the content of control as an
increase in the wind force of the air conditioner and then
aurally indicates that it will increase the wind force of the
air conditioner while transmitting a control signal to the
air conditioner. The reduction of the control objects and
contents of control may be repeated based on the
presentation of information and the monitoring of the
people's attribute after presentation. The repetition can
reduce the number of candidates without misjudgments.

[0042] In this manner, by indicating the need for
reentry if the control object or content of control cannot
be identified, misjudgment caused by a forced judgment
can be prevented and the equipment or information can
be smoothly controlled using the people's attribute such
as their motion or posture while permitting the ambiguity
of such an attribute.
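
The repeated narrowing of paragraphs [0041] and [0042] can be sketched as a small loop. This is an illustrative sketch under stated assumptions: the KEYWORDS table, the function names narrow and resolve, and the simple substring matching are hypothetical and stand in for the monitoring and recognition that the sections would perform.

```python
# Hypothetical keywords that identify each content of control in a reply.
KEYWORDS = {
    "increase wind force": "wind",
    "increase cooling": "cool",
}


def narrow(candidates, reply):
    """Keep candidates named in the reply; keep all if none is named."""
    reply = reply.lower()
    matched = [c for c in candidates
               if c in KEYWORDS and KEYWORDS[c] in reply]
    return matched or list(candidates)


def resolve(candidates, replies):
    """Return (decision, prompts); decision is None while ambiguous."""
    prompts = []
    remaining = list(candidates)
    for reply in replies:
        if len(remaining) == 1:
            break
        # Urge reentry instead of forcing a judgment.
        prompts.append("Enter data again: " + " / ".join(remaining))
        remaining = narrow(remaining, reply)
    decision = remaining[0] if len(remaining) == 1 else None
    return decision, prompts
```

Returning None rather than picking one of several remaining candidates mirrors the point made above that a forced judgment is avoided.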

[0043] In the above embodiments, when the reentry from
people is monitored, the voice is used as the people's
attribute, but another attribute (a specific movement, etc.)
or a combination thereof can be used. For example, when
the control object and content of control determined by
the control object and content determination section 5
are presented to urge the people to make a determination,
and the people make a circle figure by using an arm, hand,
or finger (see FIG. 9(1)-(4)) or put up a thumb
(see FIG. 9(5)), the control object and content
determination section 5 determines the presented matter
as the control object and content of control. Further,
when the people make the configuration of "X" with the
arms or fingers (see FIG. 9(6), (7)) or shake their hand in
a horizontal direction (see FIG. 9(8)), the control object
and content determination section 5 does not adopt the
presented matter as the control object and content of
control but urges reentry.

[0044] Although in the above example, the pointing
motion or posture is used as the people's attribute in
determining an operator and candidates for a control
object, other attributes (other indicating motions such as
the direction of the eyes or head and a voice, or motions
or postures other than the indicating motions) and their
combinations may be used. If, for example, the people's
position is used as their attribute, a person such as one
in the next room who is not related to the operation of
the television is prevented from being mistakenly
selected as an operator. The use of the people's posture
as an attribute, for example, enables a sitting person
to be given top priority and prevents the accidental
motion of a person who happens to pass through the room
from being mistakenly selected. In addition, using the
people's face, expressions, identity, or age as their attribute,
those who are limited in the use of the television or who
operate the television inappropriately can be excluded
from the candidates for an operator (for example,
children cannot operate the television after nine p.m.).
Using the people's physical and mental handicap, sex,
or physiological condition as their attribute, an
appropriate person can be given top priority in operation
based on this attribute. Using the people's belongings, an
ordinary article can act as a remote controller (for example,
holding a red ball indicates the television), a motion that
the user can use easily or the user's favorite motion
can be used to select the television, and the top priority
among several people (who holds the red ball first) for
operating the television can be visibly presented.
[0045] In addition, a combination of multiple attributes
(a voice and the direction of the eyes) enables the
equipment to be smoothly operated without forcing the
person to make a particular motion. For example, by
combining the uttered word "television" and the direction
of the eyes to the television, an erroneous reaction
to an accidental motion (the uttered word "television"
during conversation) can be prevented.
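
The combination of a voice and the direction of the eyes in paragraph [0045] amounts to requiring that two attributes agree. A minimal sketch, assuming the recognizers already yield the uttered words as text and the apparatus being looked at as a name (the function name confirmed_target is hypothetical):

```python
def confirmed_target(uttered_words, gaze_target):
    """Return the apparatus only when speech and gaze agree, else None."""
    if gaze_target is None:
        # The word was uttered while looking at no apparatus,
        # for example during conversation.
        return None
    if gaze_target.lower() in uttered_words.lower():
        return gaze_target
    return None
```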
[0046] The operator can also be appropriately
selected by combining the people's attribute with their
peripheral environment. If, for example, the fan is to be
operated and if the concentration of a particular gas
(carbon dioxide) in the air is monitored as the people's
peripheral environment, an appropriate person can be
selected as an operator for the fan based on the people's
sensitivity to the environment (those who have a
headache when the concentration of carbon dioxide is
high). In the operation of the air conditioner, by monitoring
the temperature, humidity, or air current condition as
the people's peripheral environment and registering
their sensitivity to the environment (those who are
sensitive to the heat or cold and whose skin is likely to be
dry in winter) beforehand, a person who is sensitive to
the heat is given top priority in operating the air
conditioner if the temperature and humidity are high and if
there are few indoor air currents. If the sound volume of
the television is to be controlled and if the outdoor
sound, which is the people's peripheral environment, is
loud, an appropriate person can be selected as an operator
for the television based on the people's sensitivity
to the environment (those who have difficulty in hearing
sound). When the illuminance of the lighting is to be
changed, the indoor and outdoor illuminances are monitored
as the people's peripheral environment to enable
an appropriate person to be selected as an operator for
the lighting based on the people's sensitivity to the
environment (those who have difficulty in reading characters
when dark). If the time is monitored as a peripheral
environment, the use time can be limited depending on the
person. For example, it is possible to prevent children
from operating the television after nine p.m.
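
The air conditioner example of paragraph [0046] can be sketched as follows. This is only a sketch: the function name pick_operator, the record layout, and the thresholds of 28 °C and 70 % are hypothetical values chosen for illustration, since the text says only that the temperature and humidity are "high".

```python
def pick_operator(people, temperature_c, humidity_pct, air_current):
    """people: list of dicts with 'name' and 'heat_sensitive' keys."""
    # Hypothetical thresholds for "high" temperature and humidity.
    hot_and_still = (
        temperature_c >= 28 and humidity_pct >= 70 and not air_current
    )
    if hot_and_still:
        # A registered heat-sensitive person gets top priority.
        for person in people:
            if person["heat_sensitive"]:
                return person["name"]
    # Otherwise fall back to the first person, if any.
    return people[0]["name"] if people else None
```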
[0047] Likewise, although the above example uses a
voice as the people's attribute in determining the
candidates for the content of control, other attributes (other
motions or postures) and their combinations may be
used. For example, by using a motion as the people's
attribute, the equipment can be naturally controlled as in
conversation with the people. Examples of such a
motion include applying the forefinger to the front of the
mouth (see FIG. 10(1)) or plugging the ears (see FIGS.
10(2) and (3)) to reduce the sound volume of the television,
and keeping the hands near the respective ears
(see FIG. 10(4)) to increase the sound volume of the
television. Further, the reproduction of a video etc. is
temporarily stopped by making the configuration of "T"
with both hands (see FIG. 10(5)), or the TV is switched
off by waving the hand to signal "good-bye" (see
FIG. 10(6)).

[0048] In addition, although, in the above example, the 
attribute used to determine the candidates for a control 
object differs from the attribute used to determine the 
candidates for the content of control, only one attribute 
can be used to determine the candidates for both a con- 
trol object and the content of control. Examples are 
shown below. It is assumed that there are people in a 
room and that there are an air conditioner, a television, 
a fan, an air cleaner, a telephone, a desk, and a bed in 
the room. 

[0049] When a person makes a motion of applying the
forefinger to the front of the mouth or plugging the ears
with the hands, the control object candidate determination
section 3 determines as a candidate the television
that outputs sound, and the control content candidate
determination section 4 determines the reduction of the
sound volume as a candidate for the content of control.
When a person picks up the telephone receiver, the
control object candidate determination section 3
determines as a candidate the television that outputs sound,
and the control content candidate determination section
4 determines the reduction of the sound volume as a
candidate for the content of control. When a person
uses the hand to fan the face or body (see FIG. 10(7))
or says "Hot", the control object candidate determination
section 3 determines as a candidate the air conditioner,
window, or fan that relates to air conditioning, and
the control content candidate determination section 4
determines as candidates for the content of control
switching-on, the reduction of the set temperature, the
opening of the window, the switching-on of the
fan, and the increasing of the volume of window. Further,
when people pinch their nose (see FIG. 10(8)), the control
object candidate determination section 3 determines
the air cleaner or window as the candidate, and the
control content candidate determination section 4 determines
switching on the air cleaner or opening the window as the
candidate. When a person says "good-bye" or "bye-bye"
during conversation, the control object candidate
determination section 3 determines the telephone as a
candidate, and the control content candidate determination
section 4 determines the disconnection of the telephone
line as a candidate for the content of control. When a
person sits in a chair and opens a book or holds something
to write with, the control object candidate determination
section 3 determines as candidates the lighting in the room
and the lighting attached to the desk, and the control
content candidate determination section 4 determines
switching-on as a candidate for the content of control.
When a person has been sleeping in bed over a specified
period of time, the control object candidate determination
section 3 determines the lighting in the room as a
candidate, and the control content candidate determination
section 4 determines switching-off as a candidate for the
content of control. Further, when people shade their eyes
from a light by holding up their hand or an object above
their eyes (see FIG. 10(9)), when people show an expression
as if light dazzled their eyes, or when they say "dazzling",
the control object candidate determination section 3
determines the light source in the room as a candidate and
the control content candidate determination section 4
determines dimming the light or switching it off as a
candidate.
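
The examples in paragraph [0049] share one structure: a single recognized attribute yields both the candidates for a control object and the candidates for the content of control. A minimal sketch of that structure as a lookup table follows. The attribute labels, the table RULES, and the function candidates_for are hypothetical names, and only some of the examples above are encoded.

```python
# Hypothetical rule table: attribute -> (object candidates, content candidates).
RULES = {
    "finger to mouth": (["television"], ["reduce sound volume"]),
    "plug ears": (["television"], ["reduce sound volume"]),
    "pick up telephone receiver": (["television"], ["reduce sound volume"]),
    "pinch nose": (["air cleaner", "window"], ["switch on", "open window"]),
    "say good-bye": (["telephone"], ["disconnect line"]),
    "open book at desk": (["room lighting", "desk lighting"], ["switch on"]),
    "asleep in bed": (["room lighting"], ["switch off"]),
    "shade eyes": (["room lighting"], ["dim", "switch off"]),
}


def candidates_for(attribute):
    """Return (object candidates, content candidates) for one attribute."""
    objects, contents = RULES.get(attribute, ([], []))
    # Return copies so callers cannot modify the rule table.
    return list(objects), list(contents)
```

An unrecognized attribute yields two empty lists, which leaves the later sections with no candidates to combine.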

[0050] As described above, the first embodiment of 
this invention enables the people's daily attributes to be 
used to control the equipment smoothly without the use 
of a remote controller. In addition, even if there are sev- 
eral people or apparatuses, this embodiment can 
reduce misjudgment and execute control without the
need for complicated indicating motions.
[0051] FIG. 2 is a block diagram showing a second 
embodiment of the invention. In this figure, the configu- 
ration of this embodiment and the operation of each 
section are the same as in the first embodiment. 
According to the second embodiment, however, the 
monitoring section 1, operator selection section 2, con- 
trol object candidate determination section 3, control 
content candidate determination section 4, and control 
object and content determination section 5 are attached 
to an apparatus to be controlled. If, for example, a tele- 
vision incorporates each of these sections, the control 
object candidate determination section 3 determines 
whether the television is a candidate. Based on the peo- 
ple's predetermined attribute, for example, the uttered 
words "Switch the channel", the control content candi- 
date determination section 4 determines the switching 
of the channel as a candidate for the content of control, 
and since the switching of the channel is included in the 
contents of control for the television, the control object 
and content determination section 5 switches the chan- 
nel. 

[0052] Thus, the second embodiment of this invention
not only provides the same effects as those of the first
embodiment but also only requires the control object
candidate determination section 3 of the apparatus to
determine whether the apparatus has been selected,
thereby reducing the amount of processing required to
determine candidates for a control object compared to
the first embodiment. It is also advantageous that
despite the arbitrary movement of the apparatus, the
processing executed by the apparatus to determine
candidates for a control object does not need to be
changed.

[0053] FIG. 3 is a block diagram showing a third
embodiment of this invention. In this figure, the
configurations of apparatuses to be controlled 1, 2, and 3 and
the operation of each section are the same as in the
equipment to be controlled in the second embodiment
(the control objects 1, 2, and 3 are, for example, a
television, an air conditioner, and a fan, respectively). The
third embodiment, however, includes multiple apparatuses
to be controlled. The monitoring section 1 outputs
monitored contents to a communication network, the
operator selection section 2 outputs information on a
selected attribute of the people to the communication
network, the control object candidate determination
section 3 outputs information on determined candidates
for a control object (whether the apparatus to which this
section belongs is a candidate or the degree to which
this apparatus is considered to be a candidate) to the
communication network, the control content candidate
determination section 4 outputs information on the
determined content of control to the communication network,
and the control object and content determination
section 5 outputs information on the control object and
content of control to the communication network.
[0054] The operator selection section 2 selects an
operator based on information obtained from the
communication network and information from the monitoring
section 1 in order to present information on the operator,
and the control object candidate determination section
3 determines candidates for a control object based on
the information obtained from the communication network
and the information from the monitoring section in
order to present information on the candidates for a
control object. The control content candidate determination
section 4 determines candidates for the content
of control based on the information obtained from the
communication network and the information from the
monitoring section in order to present information on the
candidates for the content of control, and the control
object and content determination section 5 presents
information on the control object and content of control
based on the information obtained from the communication
network and the candidates for a control object
determined by the control object candidate determination
section 3 and the candidates for the content of control
determined by the control content candidate
determination section 4, and then effects the control on
the control object. This embodiment differs from the
second embodiment in that due to the presence of the
plurality of control apparatuses, control unintended by
the operator may be provided when each apparatus
individually determines a control object and the content
of control. For example, if a person points to the
intermediate point between the television (a control object 1)
and the air conditioner (a control object 2) and says
"Switch on", both the television (the control object 1)
and the air conditioner (the control object 2) are
switched on even if only the television (the control object
1) is to be operated.

[0055] In this case, by obtaining, from the other controlled
apparatuses, information on the candidates for a
control object and the content of control and on the
control object and the content of control, the candidates for
a control object and the content of control can be limited
as in the first embodiment. When, for example, the control
object candidate determination section of the television
(the control object 1) outputs "television" to the
communication network as a candidate for a control
object and the control object candidate determination
section of the air conditioner (the control object 2)
outputs "air conditioner" to the communication network as
a candidate for a control object, the control object and
content determination section of the television (the
control object 1) determines a control object and presents
the corresponding information based on the information
on candidates for a control object for the television and
the information on candidates for a control object for the
air conditioner obtained through the communication network.
In this case, both the television and air conditioner
are candidates, so the television presents information
prompting reentry. The control object and content
determination section of the air conditioner (the control object
2) executes similar processing. By outputting to the
communication network not only the candidates for a
control object but also information on the degree to
which each apparatus is considered to be a candidate (for
example, the television: 10; the air conditioner: 5), the
control object and content determination sections of the
controlled apparatuses 1 and 2 can compare the degrees to
which each apparatus is considered to be a candidate in
order to determine as a control object the apparatus
having the larger degree (in this case, the television). The
candidates for the content of control are similarly processed
by exchanging information between different controlled
apparatuses.
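
The comparison of degrees in paragraph [0055] can be sketched as a function that every controlled apparatus evaluates on the same shared data, so that all of them reach the same decision. This is an illustrative sketch: the function name decide_control_object is hypothetical, and treating a tie as a case for prompting reentry is an assumption made here, by analogy with the case above in which both apparatuses are plain candidates.

```python
def decide_control_object(degrees):
    """degrees: dict mapping apparatus name to its candidate degree.

    Return the apparatus with the single largest degree, or None when
    no unique winner exists (assumed here to mean: prompt reentry).
    """
    if not degrees:
        return None
    best = max(degrees.values())
    winners = [name for name, degree in degrees.items() if degree == best]
    return winners[0] if len(winners) == 1 else None
```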

[0056] In addition, by obtaining information on the
operator from the communication network, only the
apparatus to which the operator is located closest can
respond without causing multiple apparatuses to
respond to the motion of one person (such as a pointing
motion). If, for example, the operator selected by the
operator selection section of the television (the control
object 1) is identical to the operator selected by the
operator selection section of the air conditioner (the
control object 2), as obtained from the communication
network, the above procedure is executed to determine
either the television (the control object 1) or the air
conditioner (the control object 2) as a control object. In
addition, if the operators are different, the television
(the control object 1) and the air conditioner (the control
object 2) can execute processing based only on the
attributes of the respective selected operators. In addition,
information output by the monitoring section of the
television (the control object 1) can be used by the control
object candidate determination section, control content
candidate determination section, or operator selection
section of the air conditioner (the control object 2). This
has the effect of providing information that cannot be
monitored using only the air conditioner (information on
blind spots). In addition, information output by the control
object and content determination section of the air
conditioner (the control object 2) can be displayed by the
television (the control object 1). In this case, the output
means (images, voices, or light) of each controlled
apparatus can be shared among the controlled apparatuses.

[0057] The communication network may be wired
(connected via network, telephone, or power lines) or
wireless (using a communication medium such as electric
waves, infrared rays, or ultrasonic waves), or may
be a mixture of wired and wireless networks.

[0058] Thus, the third embodiment of this invention
does not only provide the effects of the first and second
embodiments but can also consistently and smoothly
control the equipment based on the people's predetermined
attribute even if each controlled apparatus
includes a mechanism for individually determining an
operator, a control object, and the content of control.
[0059] FIG. 4 is a block diagram showing a fourth
embodiment of this invention. In this figure, the operation
of each section is similar to that in the first embodiment.
This embodiment, however, does not include the
operator selection section 2, and determines candidates
for a control object and the content of control for
all the people. If, for example, there are N people, the
determination of candidates for a control object and the
determination of candidates for the content of control
are executed N times (may be concurrently executed).
The control object and content determination section 5
determines a control object and the content of control
based on the candidates for a control object and the
content of control determined for all the people in order
to execute control. The majority rule is used to determine
a control object and the content of control. If, for
example, there are N people and if half or more of the
people issue an instruction for a decrease in the set
temperature for the air conditioner, control is provided
such that the temperature for the air conditioner is
reduced. In addition, if the control objects and the
contents of control for the respective people are consistent,
the corresponding controls may be simultaneously provided.
Thus, the fourth embodiment of this invention
does not only provide the effects of the first embodiment
but can also operate the equipment taking operations
for the several people simultaneously into consideration.
To execute similar processing using remote controllers,
as many remote controllers as there are people must
always be prepared, each to be operated by one person.
This invention enables equipment or information to
be controlled in such a way as to reflect instructions
from several people without the use of a large number of
remote controllers.
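
The majority rule of paragraph [0059] can be sketched with a simple vote count. This is a sketch under stated assumptions: the function name majority_decision is hypothetical, each person's instruction is represented as one (control object, content of control) pair or None, and when two instructions each reach exactly half, the one encountered first wins, which the text does not specify.

```python
from collections import Counter


def majority_decision(instructions):
    """instructions: one (object, content) pair or None per person.

    Return the pair issued by half or more of the people, else None.
    """
    n = len(instructions)
    counts = Counter(i for i in instructions if i is not None)
    if not counts:
        return None
    pair, votes = counts.most_common(1)[0]
    # "Half or more of the people" as stated in the example.
    return pair if 2 * votes >= n else None
```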

[0060] FIG. 5 is a block diagram showing a fifth 
embodiment of this invention. In this figure, the opera- 
tion of each section is similar to that in the fourth 
embodiment. In the fifth embodiment, however, the 
monitoring section 1, the control object candidate determination
section 3, control content candidate determination section 4,
and control object and content determination
section 5 are attached to a controlled apparatus (for 
example, the air conditioner). With this configuration, 
the fifth embodiment of this invention does not only pro- 
vide the effects of the fourth embodiment but also 
requires the control object candidate determination sec- 
tion 3 of each apparatus to only determine whether that 
apparatus has been selected, thereby reducing the 
amount of processing required to determine candidates 
for a control object, compared to the fourth embodi- 
ment. This embodiment is also advantageous in that 
despite the free movement of the position of the equip- 
ment, the processing for determining candidates for a 
control object for the equipment does not need to be 
changed. 

[0061] FIG. 6 is a block diagram showing a sixth 
embodiment of this invention. In this figure, the opera- 
tion of each section is similar to that in the first embodi- 
ment. According to the sixth embodiment, however, the 
control object candidate determination section 3 deter- 
mines candidates for a control object for all the people, 
and the control content candidate determination section 
4 determines candidates for the content of control for all 
the people. If, for example, there are N people, the
determination of candidates for a control object and the 
determination of candidates for the content of control 
are executed N times (may be concurrently executed). 
[0062] The operator selection section 2 determines an 
operator based on the candidates for a control object 
and the content of control for the N people. The control 
object and content determination section 5 determines 
a control object and the content of control based on the 
candidates for a control object and the content of control 
for the operator selected by the operator selection sec- 
tion 2. 

[0063] The configuration of the sixth embodiment of 
this invention can provide effects similar to those of the 
first embodiment. In addition, by selecting an operator 
after candidates for a control object and the content of 
control have been determined, a procedure based on 
the candidates for a control object and the content of 
control can be executed by, for example, avoiding 
selecting as an operator those for which candidates for 
a control object and the content of control cannot be 
determined due to the ambiguity of their predetermined 
attribute (for example, an indicating motion). 
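
The ordering used in the sixth embodiment, candidates first and operator second, can be sketched as follows. The function name select_operator_after_candidates and the data layout are hypothetical, and choosing the first eligible person is an assumption made only to keep the sketch short.

```python
def select_operator_after_candidates(per_person):
    """per_person: dict name -> (object candidates, content candidates).

    Skip anyone whose candidates could not be determined, for example
    because the indicating motion was too ambiguous.
    """
    for name, (objects, contents) in per_person.items():
        if objects and contents:
            return name
    return None
```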
[0064] FIG. 7 is a block diagram showing a seventh
embodiment of this invention. In this figure, the operation
of each section is similar to that in the sixth embodiment.
In the seventh embodiment, however, the
monitoring section 1, the control object candidate
determination section 3, control content candidate determination
section 4, operator selection section 2, and control object
and content determination section 5 are attached to a
controlled apparatus (for example, the air conditioner).
With this configuration, the seventh embodiment of this
invention does not only provide the effects of the sixth
embodiment but also requires the control object candidate
determination section 3 of each apparatus to only
determine whether that apparatus has been selected,
thereby reducing the amount of processing required to
determine candidates for a control object, compared to
the sixth embodiment. This embodiment is also advantageous
in that despite the free movement of the position
of the equipment, the processing for determining
candidates for a control object for the equipment does
not need to be changed.

[0065] FIG. 8 is a block diagram showing an eighth
embodiment of this invention. In this figure, the operation
of each section is almost the same as in the first
embodiment. In the eighth embodiment, however, the control
object candidate determination section 3 determines
candidates for a control object for all the people, and the
control content candidate determination section 4 determines
candidates for the content of control for all the people. If,
for example, there are N people, the determination of
candidates for a control object and the determination of
candidates for the content of control are each executed N
times (possibly concurrently). The control object and content
determination section 5 determines a control object and the
content of control from the candidates for a control object
and the content of control for the operator selected by the
operator selection section 2. The configuration shown in the
eighth embodiment of this invention can provide effects
similar to those of the first embodiment.
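
The flow of the eighth embodiment can be illustrated with a minimal Python sketch. It is not taken from the application: the dictionary keys (`indicated_objects`, `indicated_contents`, `is_operator`) and all function names are hypothetical, and each function is only a stand-in for the numbered section it mentions. Candidates are determined once per person, and the final decision uses only the candidates of the selected operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidates:
    objects: tuple   # candidate control objects, e.g. ("TV",)
    contents: tuple  # candidate contents of control, e.g. ("switch on",)


def determine_candidates(observation):
    """Hypothetical stand-in for sections 3 and 4 (one person's candidates)."""
    return Candidates(
        objects=tuple(observation.get("indicated_objects", ())),
        contents=tuple(observation.get("indicated_contents", ())),
    )


def select_operator(observations):
    """Hypothetical stand-in for section 2: first person flagged as operator."""
    for person, observation in observations.items():
        if observation.get("is_operator", False):
            return person
    return None


def determine_control(observations):
    """Stand-in for section 5: use only the selected operator's candidates."""
    # Candidate determination runs once per person (N times for N people).
    per_person = {p: determine_candidates(o) for p, o in observations.items()}
    operator = select_operator(observations)
    if operator is None:
        return None
    chosen = per_person[operator]
    # Decide only when exactly one object and one content remain.
    if len(chosen.objects) == 1 and len(chosen.contents) == 1:
        return chosen.objects[0], chosen.contents[0]
    return None
```

In this sketch an undetermined result is reported as `None`; how the operator is actually recognized from the monitored attributes is left open here.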

[0066] Although the first to eighth embodiments have been
described by assuming that the method is implemented indoors,
they can be adapted for the outdoor operation of equipment.
In addition to the operation of equipment, these embodiments
can be adapted for the manipulation of information such as
manipulation of objects on a screen or the control of
screens.
[0067] This invention can be realized using hardware,
software on a computer, or a mixture of the two.
[0068] This invention is also a program storage medium that
stores programs used to realize all or part of the operations
of the control method according to the present invention
described above.
[0069] As is apparent from the above description, this
invention is a control method for enabling equipment to be
smoothly controlled using people's daily attributes and
without forcing the people to make complicated predetermined
motions, by taking into account the presence of several
people or multiple apparatuses and the ambiguity of the
people's motions and postures that prevent equipment or
information from being smoothly operated using the people's
attributes.
[0070] In addition, a control object and the content of
control can be reliably identified by determining a plurality
of candidates for a control object and the content of control
based on the people's predetermined attributes and using both
information on candidates for a control object and
information on candidates for the content of control to limit
the total number of candidates.
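
One way to picture how the two kinds of candidates limit each other is the following Python sketch. It is an illustration under stated assumptions, not the method of the application: the capability table `SUPPORTED` and the function names are hypothetical, and "coherency" is reduced here to the simple check that a candidate object can accept a candidate content of control.

```python
# Hypothetical capability table: which contents of control each object accepts.
SUPPORTED = {
    "TV": {"switch on", "switch off", "volume down"},
    "air conditioner": {"switch on", "switch off", "temperature down"},
    "window": {"open", "close"},
}


def narrow(object_candidates, content_candidates):
    """Keep only (object, content) pairs that are mutually consistent."""
    return [
        (obj, content)
        for obj in object_candidates
        for content in content_candidates
        if content in SUPPORTED.get(obj, set())
    ]


def decide(object_candidates, content_candidates):
    """Return the single remaining pair, or None if still ambiguous or empty."""
    pairs = narrow(object_candidates, content_candidates)
    return pairs[0] if len(pairs) == 1 else None
```

With candidates `["TV", "window"]` and `["volume down"]`, only the TV accepts the content, so the pair is determined; with `["TV", "air conditioner"]` and `["switch off"]`, two pairs remain and further information would be needed.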

[0071] Furthermore, required information can be provided to
the people, and a control object and the content of control
can be smoothly determined, by presenting the people with
information on the operator and on the candidates determined
by the equipment, and by prompting reentry and further
observing the people's attributes if a control object and the
content of control cannot be determined.

[0072] Further, according to the present invention,
information can be presented by a sound device as well as by
a screen, so that even a visually impaired person or a
hearing-impaired person can confirm the information.
Therefore the present invention is useful for welfare work.

[0073] Further, when the monitoring is executed
intermittently, power consumption can be greatly reduced.
Therefore the present invention is useful for protection of
the environment of the earth.

Claims 

1. A control method for monitoring a person's 
attributes and based on the results, in predeter- 
mined content of the control, controlling equipment 
to be controlled. 

2. A control method according to Claim 1 character- 
ized by monitoring said person's peripheral environ- 
ment and also using these results to execute said 
control. 

3. A control method according to Claim 1 or 2 charac- 
terized by executing said monitoring constantly or 
intermittently. 

4. A control method according to Claim 1, 2, or 3
characterized in that based on the results of said
monitoring, an operator is selected, followed by the
selection of an object to be controlled and the content of
the control.

5. A control method according to Claim 4 characterized in
that if said control object, said operator, or said content
of control is not determined, result information indicating
that it is not determined is presented.

6. A control method according to Claim 5 characterized in
that at least one of light, characters, voice, sound, or
vibration is used to present the result information.

7. A control method according to Claim 1, 2, or 3
characterized in that based on the results of said
monitoring, candidates for an object to be controlled and for
the content of the control are first selected, and in that an
operator is then selected, followed by the determination of
an object to be controlled and the content of the control.

8. A control method according to Claim 1, 2, or 3
characterized by monitoring several people at a time.

9. A control method according to Claim 1, 2, 3, or 4
characterized in that said equipment includes multiple
apparatuses.

10. A control method according to any of Claims 1 to 9
characterized by presenting said people with information on
the results of said monitoring and executing said control
based on responses and instructions issued by said people in
response to the presentation.

11. A control method according to Claim 10 characterized in
that said presented information is at least one of
information on the candidates for said people, information on
the candidates for said object equipment, and information on
the content of the control.

12. A control method according to Claim 1, 2, or 3
characterized by simply presenting said people with
information on the results of said monitoring while
automatically executing said control.

13. A control method according to Claim 1, 2, or 3
characterized in that information on the results of said
monitoring is not presented to said people while
automatically executing said control.

14. A control method according to Claim 12 or 13
characterized in that as information on the results of said
monitoring, a plurality of candidates are provided for at
least one of the people, equipment, and content of control,
and in that based on the candidates, said control is
automatically executed.

15. A control method according to Claim 1, 2, or 3
characterized in that a control means for executing said
control is integrated into the object equipment.

16. A control method according to Claim 15 characterized in
that said control means is integrated into each of a
plurality of apparatuses, and in that the plurality of
control means exchange various information using a
communication path so that said control is executed based on
the results of the information exchange.

17. A control method according to Claim 16 characterized in
that said control means executes said control by transmitting
the results of its own monitoring to the other control means
while adding the contents obtained from the other control
means through said communication path.

18. A control method according to any of Claims 1 to 17
characterized in that the attributes of said people are their
motions.

19. A control method according to any of Claims 1 to 18
characterized in that said control comprises changing control
parameters for the object equipment or turning them on and
off.

20. A control method according to Claim 1 character- 
ized in that as the people's predetermined 
attributes, the method detects a motion in which 
they apply their forefinger to the front of their mouth, 
in that the object equipment outputs sound based 
on said motion, and in that as a change in the value 
of a control parameter, the volume of said equip- 
ment is reduced. 

21. A control method according to Claim 1 character- 
ized in that as the people's predetermined 
attributes, the method detects a motion in which 
they plug their ears, in that the object equipment 
includes equipment outputting sound based on said 
motion and equipment controlling the volume 
based on said motion, and in that as a change in 
the value of a control parameter, the volume of said 
equipment is reduced. 

22. A control method according to Claim 1 character- 
ized in that as the people's predetermined 
attributes, the method detects a motion in which 
they keep their hand next to their ear, in that the 
object equipment includes equipment outputting 
sound based on said motion and equipment con- 
trolling the volume based on said motion, and in 
that as a change in the value of a control parameter, 
the volume of said equipment is increased. 

23. A control method according to Claim 1 character- 
ized in that as the people's predetermined 
attributes, the method detects a motion in which 
they use their hand to fan their face or part of their 
body, in that the object equipment includes equip- 
ment controlling air conditioning based on said 
motion and a window, and in that as a change in the 
value of a control parameter, the volume of said 
equipment is reduced. 

24. A control method according to Claim 1 characterized by
continuously or intermittently monitoring the attributes of a
single or plural people, determining candidates for a control
object and the content of control based on the detection of
the people's predetermined attribute, and determining a
control object and the content of control based on said
candidates for a control object and the content of control.

25. A control method according to Claim 1 characterized by
continuously or intermittently monitoring the attributes of a
single or plural people and their peripheral environment,
determining candidates for a control object and the content
of control based on the detection of the people's
predetermined attribute and their peripheral environment, and
determining a control object and the content of control based
on said candidates for a control object and the content of
control.

26. A control method according to Claim 25 characterized in
that the peripheral environment comprises at least one of
temperature, humidity, illuminance, air current, time, or
sound.

27. A control method according to any of Claims 24 to 26
characterized by outputting to a communication network,
information obtained through said monitoring, and/or
outputting to the communication network, candidates for a
control object and the content of control, and/or outputting
to the communication network, a control object and the
content of control.

28. A control method according to Claim 27 characterized by
continuously or intermittently monitoring said communication
network, and determining candidates for a control object and
the content of control based on information from the
communication network and the detection of the people's
predetermined attribute.

29. A control method according to Claim 27 or 28
characterized by continuously or intermittently monitoring
said communication network, and determining a control object
and the content of control based on information from the
communication network and the candidates for a control object
and the content of control.

30. A control method according to any of Claims 24 to 29
characterized in that said candidates for a control object
are used as control objects and in that the candidates for
the content of control are used as the content of control.

31. A control method according to any of Claims 24 to 30
characterized in that based on the detection of the
attributes of several people, candidates for a control object
and the content of control are determined for each of said
people, and in that based on said candidates for a control
object and the content of control, a control object and the
content of control are determined.

32. A control method according to Claim 31 character- 
ized in that using the candidates for a control object 
and the content of control determined for each per- 
son, a control object and the content of control are 
determined based on the majority rule. 

33. A control method according to any of Claims 24 to 
30 characterized in that based on the detection of a 
people's predetermined attribute, an operator and 
candidates for a control object and the content of 
control are determined, and in that based on said 
operator and said candidates for a control object 
and the content of control, a control object and the 
content of control are determined. 

34. A control method according to Claim 33 character- 
ized in that based on the detection of the attributes 
of several people, candidates for a control object 
and the content of control are determined for each 
of said people, in that an operator is selected based 
on said candidates for a control object and the con- 
tent of control, and in that based on the attributes of 
the selected operator, candidates for a control 
object and the content of control are determined. 

35. A control method according to any of Claims 24 to 
30 characterized in that an operator is selected 
based on the detection of a people's predetermined 
attribute, and in that based on the attributes of the 
selected operator, candidates for a control object 
and the content of control are determined. 

36. A control method according to any of Claims 33 to 
35 characterized in that information on the selected 
operator is output to a communication network. 

37. A control method according to any of Claims 27 to 
29 and 36 characterized in that the communication 
network comprises a wireless or a wired communi- 
cation network. 

38. A control method according to Claims 33 to 36 
characterized in that the selected operator is pre- 
sented as information. 

39. A control method according to any of Claims 24 to 
38 characterized in that information is presented 
based on the candidates for a control object and the 
content of control. 

40. A control method according to Claim 39 characterized by
presenting at least one of candidates for a control object, a
control object, candidates for the content of control, the
content of control, the indication of confirmation, the
indication of reentry, the selections of the candidates, or
inability to determine an object.

41. A control method according to any of Claims 38 to 40
characterized in that at least one of light, characters,
voice, sound, or vibration is used to present information.

42. A control method according to any of Claims 38 to 41
characterized in that based on the results of the monitoring
of the people's attribute after the presentation of
information, the candidates for a control object are reduced
and/or the candidates for the content of control are reduced
and/or a control object is determined and/or the content of
control is determined and/or information is presented.

43. A control method according to any of Claims 24 to 42
characterized in that based on predetermined references, the
candidates for a control object are reduced and/or the
candidates for the content of control are reduced and/or a
control object is determined and/or the content of control is
determined.

44. A control method according to Claim 43 characterized in
that based on the coherency of candidates for a control
object and the candidates for the content of control, the
candidates for a control object are reduced and/or the
candidates for the content of control are reduced and/or a
control object is determined and/or the content of control is
determined.

45. A control method according to Claim 43 or 44
characterized in that using at least one of the time interval
between a first time at which a predetermined attribute used
to determine candidates for a control object is detected and
a second time at which a predetermined attribute used to
determine candidates for the content of control is detected;
the temporal order of said first and second times; or the
duration of the detection of said predetermined attribute,
the candidates for a control object are reduced and/or the
candidates for the content of control are reduced and/or a
control object is determined and/or the content of control is
determined.

46. A control method according to any of Claims 24 to 45
characterized in that the content of control is to operate
the equipment and/or to manipulate the information.

47. A control method according to any of Claims 24 to 46
characterized in that at least one of people's positions,
postures, expressions, motions, voices, physiological
conditions, identities, forms, weights, sexes, ages, physical
and mental handicaps, or belongings is used as the people's
attribute.

48. A control method according to Claims 24 to 47 
characterized in that the people's attribute used to 
determine candidates for a control object differs 
from the people's attribute used to determine candi- 
dates for the content of control. 

49. A control method according to Claims 24 to 47 
characterized in that the people's attribute used to 
determine candidates for a control object is the 
same as that used to determine candidates for the 
content of control. 

50. A control method according to Claim 49 character- 
ized in that as the people's attribute, the method 
detects a motion in which they apply their forefinger 
to the front of their mouth, in that the candidates for 
a control object include equipment outputting sound 
based on said motion and equipment controlling the 
volume based on said motion, and in that the candi- 
dates for the content of control include control for 
reducing the volume. 

51. A control method according to Claim 49 character- 
ized in that as the people's attribute, the method 
detects a motion in which they plug their ears, in 
that the candidates for a control object include 
equipment outputting sound based on said motion 
and equipment controlling the volume based on 
said motion, and in that the candidates for the con- 
tent of control include control for reducing the vol- 
ume. 

52. A control method according to Claim 49 character- 
ized in that as the people's predetermined 
attributes, the method detects a motion in which 
they use their hand to fan their face or part of their 
body, in that the candidates for a control object 
include equipment controlling air conditioning 
based on said motion and a window, and in that the 
candidates for the content of control include the 
reduction of a set temperature or ventilation. 

53. A control method according to Claim 49 character- 
ized in that as the people's attribute, the method 
detects a motion in which they keep their hand next
to their ear, in that the candidates for a control 
object include equipment outputting sound based 
on said motion and equipment controlling the vol- 
ume based on said motion, and in that the candi- 
dates for the content of control include control for 
increasing the volume. 

54. A program storage medium storing programs used to realize
all or part of the operations of the control method according
to any one of Claims 1 to 53.



[Figure: block diagram. Labels recoverable from the scan:
People; Environment; Present information; Control section;
Monitoring section 1; Operator selection section 2; Control
object candidate determination section 3; Control content
candidate determination section 4; Control object and content
determination section 5; Content of control; Controlled
equipment.]



[Figure: block diagram. Labels recoverable from the scan:
People; Environment; Present information; Controlled
equipment; Control section attached to the controlled
equipment; Monitoring section 1; Operator selection section
2; Control object candidate determination section 3; Control
content candidate determination section 4; Control object and
content determination section 5.]



[Figure: block diagram, rotated in the scan and largely
illegible. Recoverable label: Communication network.]



[Figure: block diagram. Labels recoverable from the scan:
People; Environment; Present information; Control section;
Repeat N times; Monitoring section 1; Control object
candidate determination section 3; Control content candidate
determination section 4; Control object and content
determination section 5; Content of control; Controlled
equipment.]



[Figure: largely illegible in the scan. Recoverable labels:
People; Environment; Present information.]



[Figure: block diagram. Labels recoverable from the scan:
People; Environment; Present information; Control section;
Repeat N times; Monitoring section 1; Control object
candidate determination; Control content candidate
determination section 4; Operator selection section 2;
Control object and content determination section 5; Content
of control; Controlled equipment.]



[Figure: largely illegible in the scan. Recoverable labels:
People; Present information.]



[Figure: block diagram. Labels recoverable from the scan:
People; Environment; Present information; Control section;
Repeat N times; Monitoring section 1; Control object
candidate determination section 3; Control content candidate
determination section 4; Operator selection section 2;
Control object and content determination section 5; Content
of control.]






(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT
COOPERATION TREATY (PCT)

(19) World Intellectual Property Organization
International Bureau

(43) International Publication Date:
10 October 2002 (10.10.2002)

(10) International Publication Number: WO 02/080531 A2

(51) International Patent Classification 7: H04N 5/44

(21) International Application Number: PCT/IB02/00929

(22) International Filing Date: 19 March 2002 (19.03.2002)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data:
09/821,183 29 March 2001 (29.03.2001) US

(71) Applicant: KONINKLIJKE PHILIPS ELECTRONICS N.V. [NL/NL];
Groenewoudseweg 1, NL-5621 BA Eindhoven (NL).

(72) Inventors: GUTTA, Srinivas; Prof. Holstlaan 6, NL-5656 AA
Eindhoven (NL). TRAJKOVIC, Miroslav; Prof. Holstlaan 6,
NL-5656 AA Eindhoven (NL). COLMENAREZ, Antonio; Prof.
Holstlaan 6, NL-5656 AA Eindhoven (NL).

(74) Agent: GROENENDAAL, Antonius, W., M.; Internationaal
Octrooibureau B.V., Prof. Holstlaan 6, NL-5656 AA Eindhoven
(NL).

(81) Designated States (national): CN, JP, KR.

(84) Designated States (regional): European patent (AT, BE,
CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT,
SE, TR).

Published: without international search report and to be
republished upon receipt of that report

For two-letter codes and other abbreviations, refer to the
"Guidance Notes on Codes and Abbreviations" appearing at the
beginning of each regular issue of the PCT Gazette.


(54) Title: METHOD AND APPARATUS FOR CONTROLLING A MEDIA
PLAYER BASED ON USER ACTIVITY

(57) Abstract: A media player controller is disclosed that
monitors user activity and automatically controls a media
player in response to predefined events. The disclosed media
player controller includes one or more audio/visual image
capture devices focused on one or more users. The captured
audio and video information is processed by the media player
controller to identify one or more predefined events. A
number of rules can be utilized to define various user
events, such as when the user has left the room, is on the
telephone or is otherwise not paying attention to the media
player. Each rule contains one or more conditions, and,
optionally, a corresponding action-item that should be
performed when the rule is satisfied. Upon detection of a
predefined event, the corresponding action, if any, is
performed by the media player controller.

[Figure: flow chart of the user event monitoring process 400.
Steps as legible in the scan: obtain inputs from audio/visual
capture device(s) 150 (405); identify user(s) that are
present (410); analyze audio/visual information using audio
and/or video content analysis techniques (420); does
audio/video content analysis detect a predefined event as
defined in user event database? (no / yes); perform action
indicated in profile, if any, or user event database (440);
end.]

WO 02/080531 PCT/IB02/00929



Method and apparatus for controlling a media player based on user activity 



The present invention relates to methods and apparatus for controlling media 
players, and more particularly, to a method and apparatus for automatically controlling a 
media player based on user activity. 

The consumer marketplace offers a wide variety of electronic devices, such as
televisions, stereo systems and personal computers, that provide an ever-growing number of
features intended to increase the convenience and capabilities of these devices. Most
entertainment devices, for example, have an associated remote control device that allows the
user to adjust a number of the device settings remotely. For example, a user can adjust the
program channel, volume and other settings of a television using a remote control, in a
well-known manner.

While remote controls and other additional features have greatly improved the
convenience of such entertainment devices, they still require the affirmative action of the user
to manipulate the remote control (or another input mechanism associated with the device) to
indicate the manner in which the particular device settings should be adjusted. Thus, if the
remote control is not readily available, or the user does not wish to move closer to the device
itself, the user may still be unable to conveniently adjust one or more settings in a desired
manner.

It has been observed that there is often a predictable relationship between
certain user activity and a corresponding manner in which the settings of an electronic device
should be adjusted. For example, when the telephone rings while a user is watching
television, the user often responds by manually adjusting the volume or activating the mute
feature of the television. There is currently no mechanism, however, that provides an
indication to an electronic device of such user activity. A need therefore exists for a media
player controller that monitors user activity and automatically adjusts a media player in
response to predefined events. A further need exists for a media player controller that
employs a rule-base to define user activities or events, as well as the corresponding response
that should be implemented to adjust device settings.

Generally, a method and apparatus are disclosed for monitoring user activity
and automatically controlling a media player in response to predefined events. The disclosed
media player controller includes one or more audio/visual capture devices focused on one or
more users. The obtained audio and video information is processed by the media player
controller to identify one or more predefined events.

According to one aspect of the invention, a number of rules define various
user activities or events, such as when the user has left the room, is on the telephone or is
otherwise not paying attention to the media player. Each rule contains one or more
conditions, and, optionally, a corresponding action-item that should be performed when the
rule is satisfied to adjust one or more settings of the media player. Upon detection of a
predefined event, the corresponding action, if any, is performed by the media player
controller.

A more complete understanding of the present invention, as well as further
features and advantages of the present invention, will be obtained by reference to the
following detailed description and drawings.

Fig. 1 illustrates a media player controller in accordance with the present
invention;

Fig. 2 illustrates a sample table from the user profile of Fig. 1 in accordance
with the present invention;

Fig. 3 illustrates a sample table from the user event database of Fig. 1; and

Fig. 4 is a flow chart describing an exemplary user event monitoring process
embodying principles of the present invention.



Fig. 1 illustrates a media player controller 100 in accordance with the present
invention. As shown in Fig. 1, the media player controller 100 includes one or more
audio/visual capture devices 150-1 through 150-N (hereinafter, collectively referred to as
audio/visual capture devices 150) that are focused on one or more user(s) 140 of a media
player 160.

Each audio/visual capture device 150 may be embodied, for example, as a
fixed or pan-tilt-zoom (PTZ) camera for capturing image or video information, or one or
more microphones for capturing audio information (or both). The audio and video
information generated by the audio/visual capture devices 150 is processed by the media
player controller 100, in a manner discussed below in conjunction with Fig. 4, to identify one
or more predefined user activities or events. In one implementation, the present invention
employs a user profile 200 and event rules database 300, discussed further below in
conjunction with Figs. 2 and 3, that record a number of user preferences and rules,
respectively. The rules define various events that should initiate an adjustment of one or more
settings of the media player 160.

The user activities defined by each rule may be detected by the media player
controller 100 in accordance with the present invention. As discussed further below, each
rule contains one or more criteria that must be satisfied in order for the rule to be triggered,
and, optionally, a corresponding action-item that should be performed by the media player
controller 100 to adjust one or more settings of the media player 160 when the predefined
criteria for initiating the rule are satisfied. At least one of the criteria for each rule is a
condition detected in the audio or video information generated by the audio/visual capture
devices 150 using audio or vision-based techniques, in accordance with the present invention.

Upon detection of such a predefined user activity or event, the corresponding
action, if any, is performed by the media player controller 100. Typically, the corresponding
action is the issuance of a command to the media player 160 to adjust one or more settings.
The commands include, for example, mute, record, volume adjust, change program channel,
power save mode and live pause.
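
The rule structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `Command`, `Rule` and `triggered_commands`, and the string labels used for detected conditions, are hypothetical and introduced here only for illustration. Only the set of example commands comes from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, List, Optional


class Command(Enum):
    """Commands the text lists as examples of media player adjustments."""
    MUTE = "mute"
    RECORD = "record"
    VOLUME_ADJUST = "volume adjust"
    CHANGE_CHANNEL = "change program channel"
    POWER_SAVE = "power save mode"
    LIVE_PAUSE = "live pause"


@dataclass(frozen=True)
class Rule:
    """One rule: criteria that must all hold, plus an optional action item."""
    criteria: FrozenSet[str]           # hypothetical labels for detected conditions
    action: Optional[Command] = None   # the action item is optional


def triggered_commands(rules: List[Rule], detected: FrozenSet[str]) -> List[Command]:
    """Return the command of every rule whose criteria are all detected."""
    return [
        rule.action
        for rule in rules
        if rule.criteria <= detected and rule.action is not None
    ]


rules = [
    Rule(frozenset({"user_left_room"}), Command.LIVE_PAUSE),
    Rule(frozenset({"user_on_telephone"}), Command.RECORD),
    Rule(frozenset({"user_speaking_to_someone"})),  # hypothetical rule with no action item
]

commands = triggered_commands(rules, frozenset({"user_on_telephone"}))
```

A rule without an action item is simply skipped when commands are collected, which mirrors the "if any" wording used for the corresponding action.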

As discussed further below in conjunction with Figs. 2 and 3, the user
preferences and rules recorded in the user profile 200 and event rules database 300 may
include one or more criteria that are dependent on external information, such as information
from an optional electronic program guide 130 or caller id (identification) device 170. For
example, the corresponding action-item that is performed by the media player controller 100
in response to a given user activity may be dependent on features of a program, as indicated
in the electronic program guide 130. Similarly, the corresponding action-item that is
performed by the media player controller 100 in response to the media player controller 100
detecting that the telephone is ringing may be dependent on the identity of the caller, as
indicated by the caller id device 170.

As shown in Fig. 1, and discussed further below in conjunction with Fig. 4, the
media player controller 100 also contains a user event monitoring process 400. Generally, the
user event monitoring process 400 processes the audio information or images obtained by the
audio/visual capture devices 150 and detects one or more events defined in the event rules
database 300.






WO 02/080531 PCT/IB02/00929 



The media player controller 100 may be embodied as any computing device,
such as a personal computer or workstation, that contains a processor 120, such as a central
processing unit (CPU), and memory 110, such as RAM and/or ROM. Alternatively, the
media player controller 100 may be embodied as an application specific integrated circuit
(ASIC) (not shown) that is included, for example, in a television, set-top terminal or another
electronic device.

Fig. 2 illustrates an exemplary table of the user profile(s) 200 that records
various preferences of each user. As shown in Fig. 2, the user profile 200 is comprised of a
plurality of records, such as records 205-208, each associated with a different user. For each
user, the user profile 200 identifies the user in field 250 and the corresponding media
preferences of the user, if any, in field 260.

For example, the user preferences recorded in record 205 for the user, John
Smith, indicate that the user likes to pause the media player 160 when the telephone rings,
unless the call is from a particular telephone number, upon which the volume of the media
player 160 is lowered. Likewise, the user preferences recorded in record 206 for the user,
Jane Smith, indicate that the user likes to lower the volume of the media player 160 when
the telephone rings, unless the current selected program is a top-5 program, upon which a
record command is sent to the media player 160. Thus, the preferences in record 205 are
dependent upon information from the caller id device 170, and the preferences in record 206
are dependent upon information from the electronic program guide 130.
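
The two example preferences can be sketched as small decision functions. This is an illustrative sketch only: the function names, the string labels for commands, and the parameters `caller_number` and `program_is_top5` are assumptions standing in for whatever the caller id device 170 and the electronic program guide 130 actually report. The telephone number is the one shown in Fig. 2.

```python
from typing import Optional

# Hypothetical command labels; the text names live pause, lower volume and record.
LIVE_PAUSE = "live pause"
LOWER_VOLUME = "lower volume"
RECORD = "record"


def john_on_phone_ring(caller_number: Optional[str]) -> str:
    """Record 205: pause on a ringing phone, unless the call is from one number."""
    # caller_number stands in for the output of the caller id device 170
    # (None if the caller is unknown).
    if caller_number == "(777)555-1212":
        return LOWER_VOLUME
    return LIVE_PAUSE


def jane_on_phone_ring(program_is_top5: bool) -> str:
    """Record 206: lower the volume, unless the current program is a top-5 program."""
    # program_is_top5 stands in for a lookup in the electronic program guide 130.
    if program_is_top5:
        return RECORD
    return LOWER_VOLUME
```

Keeping the external information as plain function arguments makes the dependency explicit: the same detected event (the telephone ringing) maps to different commands depending on data that does not come from the capture devices.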

Generally, the user preferences recorded in the user profile(s) 200 can be
obtained explicitly, i.e., from survey responses, or implicitly, by monitoring how a given user
responds to a given set of circumstances. Thereafter, a rule can be established that defines the
given set of circumstances and the corresponding action item that should be performed.

Fig. 3 illustrates an exemplary table of the event rules database 300 that
records each of the rules that define various user activities or events. Each rule in the event
rules database 300 includes predefined criteria specifying the conditions under which the rule
should be initiated, and, optionally, a corresponding action item that should be triggered
when the criteria associated with the rule are satisfied. Typically, the action item defines one
or more adjustments to the settings of the media player 160 that should be performed when
the rule is triggered.

As shown in Fig. 3, the exemplary event rules database 300 maintains a
plurality of records, such as records 305-311, each associated with a different rule. For each
rule, the event rules database 300 identifies the rule criteria in field 350 and the
corresponding action item, if any, in field 360. For example, the rule recorded in record 306
is an event corresponding to the user remaining out of the room (or away from the vicinity of
the media player 160). As indicated in field 350, the rule in record 306 is triggered when the
user remains out of the room for a predefined minimum time interval. As indicated in field
360, the corresponding action consists of sending a command to place the media player 160
in a power save mode.
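
A time-based criterion such as the one in record 306 can be sketched with a small timer object. The class name `AbsenceTimer`, its parameters, and the threshold value are assumptions made for illustration; the text specifies only that the user must remain out of the room for a predefined minimum time interval.

```python
from typing import Optional


class AbsenceTimer:
    """Tracks how long the user has been continuously out of the room."""

    def __init__(self, min_absence_s: float) -> None:
        self.min_absence_s = min_absence_s      # assumed threshold, in seconds
        self._absent_since: Optional[float] = None

    def update(self, user_present: bool, now_s: float) -> bool:
        """Feed one observation; return True when the rule should trigger."""
        if user_present:
            self._absent_since = None           # presence resets the interval
            return False
        if self._absent_since is None:
            self._absent_since = now_s          # absence just started
        return now_s - self._absent_since >= self.min_absence_s


timer = AbsenceTimer(min_absence_s=600.0)       # 10 minutes, an arbitrary example
```

Each call to `update` would be driven by the presence result of the content analysis; when it returns True, the controller would send the power save mode command.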

Fig. 4 is a flow chart describing an exemplary user event monitoring process
400. The user event monitoring process 400 processes audio or video information (or both)
obtained from the audio/visual capture devices 150 and detects one or more events defined in
the event rules database 300. The exemplary user event monitoring process 400 is a general
process illustrating the broad concepts of the present invention. As shown in Fig. 4, the user
event monitoring process 400 initially obtains one or more inputs from the audio/visual
capture devices 150 during step 405. Thereafter, the user event monitoring process 400
optionally identifies the user(s) that are present during step 410, for example, using a
biometric evaluation of the audio or visual information obtained from the audio/visual
capture device 150. A user identification is particularly useful when the media player
controller 100 permits user-specific media preferences set forth in the user profile(s) 200 to
take precedence over the general rules set forth in the event rules database 300.

Thereafter, the audio/visual information is analyzed during step 420 using
audio and/or video content analysis (VCA) techniques. For a detailed discussion of suitable
audio content analysis techniques, see, for example, Silvia Pfeiffer et al., "Automatic Audio
Content Analysis," Proc. ACM Multimedia 96, 21-30, Boston, MA. (Nov. 1996),
incorporated by reference herein. For a detailed discussion of suitable VCA techniques, see,
for example, Nathanael Rota and Monique Thonnat, "Video Sequence Interpretation for
Visual Surveillance," in Proc. of the 3d IEEE Int'l Workshop on Visual Surveillance, 59-67,
Dublin, Ireland (July 1, 2000), and Jonathan Owens and Andrew Hunter, "Application of the
Self-Organizing Map to Trajectory Classification," in Proc. of the 3d IEEE Int'l Workshop on
Visual Surveillance, 77-83, Dublin, Ireland (July 1, 2000), incorporated by reference herein.
Generally, the audio content analysis and VCA techniques are employed to recognize various
features in the signals obtained by the audio/visual capture devices 150.

A test is performed during step 430 to determine if the audio/video content
analysis detects a predefined event, as defined in the event rules database 300. It is noted that
the general rules set forth in the event rules database 300, as analyzed during step 430, may
be modified in accordance with the specific user preferences set forth in the user profile 200.
If it is determined during step 430 that the audio/video content analysis does not detect a
predefined event, then program control returns to step 410 to continue monitoring user
activities in the manner discussed above.

If, however, it is determined during step 430 that the audio/video content
analysis detects a predefined event, then the event is processed during step 440 as indicated
in field 260 of the user profile 200, if any, for the identified user or field 360 of the event
rules database 300. Program control then terminates (or returns to step 410 and continues
monitoring user activities in the manner discussed above).
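
The flow of steps 405 through 440 can be sketched as a simple loop. This is a hedged outline, not the disclosed implementation: the function `monitor`, the `Frame` stand-in, and the two lookup tables are hypothetical, and the identification and detection steps are passed in as functions so that the sketch makes no claim about how the content analysis is actually performed.

```python
from typing import Callable, Dict, Iterable, List, Optional

# All names below are hypothetical; the analysis functions are injected so the
# sketch makes no claim about how detection is actually done.
Frame = dict   # stand-in for one batch of captured audio/video data


def monitor(
    frames: Iterable[Frame],                              # step 405: captured inputs
    identify_user: Callable[[Frame], Optional[str]],      # step 410 (optional)
    detect_event: Callable[[Frame], Optional[str]],       # steps 420 and 430
    profile_actions: Dict[str, Dict[str, str]],           # user -> event -> action
    rule_actions: Dict[str, str],                         # event -> action
) -> List[str]:
    """Return the actions that would be performed, in order (step 440)."""
    performed: List[str] = []
    for frame in frames:
        user = identify_user(frame)
        event = detect_event(frame)
        if event is None:
            continue                                      # no event: keep monitoring
        # A user-specific preference, if any, is used before the general rule.
        action = profile_actions.get(user, {}).get(event) if user else None
        if action is None:
            action = rule_actions.get(event)
        if action is not None:
            performed.append(action)
    return performed
```

The sketch uses the variant in which control returns to monitoring after an event is processed; the text also allows the process to terminate at that point.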

In a further variation, the retention schedule for a given program that is
recorded in accordance with the present invention can be determined, for example, by a
weight assigned to the program by a user or by a recommendation score assigned by a
program recommender.

A "computer program" is to be understood to mean any software product
stored on a computer-readable medium, such as a floppy-disk, downloadable via a network,
such as the Internet, or marketable in any other manner.

It is to be understood that the embodiments and variations shown and 
described herein are merely illustrative of the principles of this invention and that various 
modifications may be implemented by those skilled in the art without departing from the 
scope and spirit of the invention. 






CLAIMS: 



1. A method for controlling a media player (160), comprising

analyzing at least one of audio and video information (150) focused on a user
(140) to identify at least one predefined user activity; and

performing a predefined action item (360) to automatically adjust said media
player (160) when said user activity is identified.

2. The method of claim 1, further comprising

establishing at least one rule (305-311) defining the predefined user activity,
wherein said rule (305-311) comprises at least one condition (350) and the action item (360)
to be performed to automatically adjust said media player (160) when said rule (305-311) is
satisfied; and

wherein analyzing at least one of audio and video information (150) is focused
on a user (140) to identify said condition (350).

3. The method of claim 1 or 2, wherein said user activity suggests that said user
(140) is not paying attention to said media player (160) and said action item (360) is the
issuance of at least one of commands to pause said media player (160), or to begin recording,
or to enter a power save mode.

4. The method of claim 1 or 2, wherein said user activity is a predefined gestural
command and said action item (360) is the issuance of a corresponding command to said
media player (160).

5. A system for controlling a media player (160), comprising

a memory (110) for storing computer readable code; and

a processor (120) operatively coupled to said memory (110), said processor
(120) configured to

analyze at least one of audio and video information (150) focused on a user
(140) to identify at least one predefined user activity; and

perform a predefined action item (360) to automatically adjust said media
player (160) when said user activity is identified.

6. The system of claim 5, wherein the processor (120) is further configured to

establish at least one rule (305-311) defining the predefined user activity,
wherein said rule (305-311) comprises at least one condition (350) and the action item (360)
to be performed to automatically adjust said media player (160) when said rule (305-311) is
satisfied; and

analyze at least one of audio and video information (150) focused on a user
(140) to identify said condition (350).

7. The system of claim 5 or 6, wherein said user activity suggests that said user
(140) is not paying attention to said media player (160) and said action item (360) is the
issuance of at least one of commands to pause said media player (160), or to begin recording,
or to enter a power save mode.

8. The system of claim 5 or 6, wherein said user activity is a predefined gestural 
command and said action item (360) is the issuance of a corresponding command to said 
media player (160). 


9. A computer program product enabling a programmable device when executing 
said computer program product to function as the system for controlling a media player (160) 
as defined in any of claims 5 to 8. 



[FIG. 1: drawing of the media player controller 100; not legible in this scan]



USER PROFILE(S) 200

        USER ID         USER MEDIA PREFERENCES

 205    JOHN SMITH      SEND LIVE PAUSE COMMAND TO MEDIA PLAYER
                        UPON PHONE RINGING, UNLESS CALL IS FROM
                        (777)555-1212, THEN LOWER VOLUME

 206    JANE SMITH      SEND LOWER VOLUME COMMAND TO MEDIA
                        PLAYER UPON PHONE RINGING, UNLESS
                        SELECTED PROGRAM IS A TOP-5 PROGRAM, THEN
                        SEND RECORD COMMAND

 207    ...

 208    ROBERT SMITH    SEND LIVE PAUSE COMMAND TO MEDIA PLAYER
                        IF USER LEAVES ROOM;
                        SEND LIVE PAUSE COMMAND TO MEDIA PLAYER
                        IF USER IS NOT PAYING ATTENTION TO MEDIA
                        PRESENTATION

FIG. 2



EVENT RULES DATABASE 300

        RULE CRITERIA (350)                       ACTION (360)

 305    USER LEAVES ROOM                          SEND LIVE PAUSE COMMAND

 306    USER REMAINS OUT OF ROOM FOR              SEND POWER SAVE MODE COMMAND
        PREDEFINED TIME

 307    USER IS NOT PAYING ATTENTION TO           SEND LIVE PAUSE COMMAND
        MEDIA PRESENTATION

 308    USER IS ON TELEPHONE                      SEND RECORD COMMAND

 309    USER IS SPEAKING TO SOMEONE               SEND VOLUME ADJUST COMMAND
        ELSE IN ROOM

 310    ...

 311    USER ISSUES PREDEFINED GESTURAL COMMAND   SEND CORRESPONDING COMMAND
        TO CHANGE A SETTING OF MEDIA PLAYER

FIG. 3





 405    OBTAIN INPUTS FROM AUDIO/VISUAL CAPTURE DEVICE(S) 150

 410    IDENTIFY USER(S) THAT ARE PRESENT

 420    ANALYZE AUDIO/VISUAL INFORMATION USING AUDIO
        AND/OR VIDEO CONTENT ANALYSIS TECHNIQUES

 430    DOES AUDIO/VIDEO CONTENT ANALYSIS DETECT A PREDEFINED
        EVENT AS DEFINED IN USER EVENT DATABASE?
          NO:  RETURN TO 410
          YES: CONTINUE TO 440

 440    PERFORM ACTION INDICATED IN PROFILE,
        IF ANY, OR USER EVENT DATABASE

        END

FIG. 4